The purpose of this exercise is to help familiarize you with simple ways to explore attributes in various datasets. These skills will help you extract new datasets, connect to tabular data, and qualitatively compare different variables.
With the unprecedented growth in middle Tennessee, the Montgomery County Commission, Stormwater Management, and Health Department are working with the Tennessee Department of Environment and Conservation on an initiative to assess the relationship of brownfield sites to our community and watersheds. Brownfields are locations in communities that pose risks to future land use and development as a result of previous land use practices, particularly commercial and industrial ones (more information on brownfields is available here: https://www.osha.gov/brownfields/brownfields-qna). They often contain high levels of soil and water contamination, and in some cases pollutants can remain in the ecosystem for decades. Unfortunately, brownfields are often point source locations for ground and surface water contamination. The goal of the initiative is to determine whether any spatial characteristics of these hazardous locations have the potential to impact current and future residents of the area. The primary objectives of the initiative are to: a) examine the location of brownfields in the county, b) determine which watersheds would be primarily impacted, and c) ascertain whether there is a relationship between brownfield sites and any particular demographics in the county. With these three objectives, the county partners can make data-informed decisions to best support and prioritize programs that keep our community and environment safe.
In this exercise you will:
Software specific directions can be found for each step below. Please submit the answer to the questions and your final map by the due date.
The datasets used in this exercise can be found on the Exercise 5 GitHub page, in previous exercises such as Exercise 2 and Exercise 3, and from the Tennessee Geographic Information Council. TN GIS maintains a number of datasets in their collections that are useful for projects involving the state of Tennessee.
As with previous exercises you should begin by launching ArcGIS Pro, creating a new blank template, and creating a folder for this specific exercise. You should now see the typical starting screen that greeted you in all of the previous exercises. While some of the data for this exercise you may already have in previous exercise folders, you will start this lab by downloading a dataset from TN GIS. While they maintain a number of quality collections, you will specifically download the statewide watershed coverage (12 digit Hydrologic Unit Code) for Tennessee. This information can be found at the following link: http://www.tngis.org/water.htm. On that page you will find the link for “Download Watershed Coverage”. Click the link, and using the download button in the upper-right corner, save the tn_wbd zip file to your project folder.
Once you have downloaded the file, navigate to the saved location to unzip the file. Within the unzipped folder you will find three additional folders titled:
These are watershed files at varying levels of detail. For hydrologic units you are looking for one with the largest number of digits to get the largest scale data. So for this exercise you will unzip the tn_12dig_huc dataset.
Finally, with that final folder extracted you will find a folder titled hydrologic_units that will contain a shapefile named wbdhu12_a_tn.shp that will be used in this exercise. This is the polygon file representing the 12 digit hydrologic unit codes for the entire state of Tennessee.
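To see why more digits means more detail, it can help to look at how a hydrologic unit code is built: each level appends two digits to the code of the unit that contains it. The short Python sketch below splits a 12-digit code into its parent levels. The function name and the example code are illustrative only and are not taken from the exercise data.

```python
# Names of the nested hydrologic unit levels, keyed by code length.
HUC_LEVELS = {
    2: "region",
    4: "subregion",
    6: "basin",
    8: "subbasin",
    10: "watershed",
    12: "subwatershed",
}


def huc_parents(huc12):
    """Split a 12-digit HUC into the code of every level it nests within."""
    if len(huc12) != 12 or not huc12.isdigit():
        raise ValueError(f"expected a 12-digit code, got {huc12!r}")
    return {name: huc12[:digits] for digits, name in HUC_LEVELS.items()}


# Illustrative code only; look up real values in the attribute table.
for level, code in huc_parents("051302060101").items():
    print(f"{level:>12}: {code}")
```

Every 12-digit polygon therefore carries the codes of the coarser units around it, which is why the 12-digit dataset is the most detailed of the three folders.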
Next, you will need the tornado_data file from Exercise 2 and the census_tracts data from Exercise 3. You have a few options for obtaining this data. You can download the data again (but this time to the new project folder), you can navigate to the Exercise 2 and Exercise 3 project folders, respectively, on your computer and copy the zip files to the Exercise 5 project folder, or you can copy the data over using the Catalog pane in ArcGIS Pro. While the first two options are relatively straightforward, it is important to learn how to navigate and use the catalog in ArcGIS.
On the View tab, click the Catalog Pane button to open the Catalog pane on the right side of the screen. On the Project tab, right-click the Folders option and click “Add Folder Connection”. In the resulting window, navigate to the folder you would like to connect to and single-click the folder to select it. You don’t want to double-click into the folder. You should see the name of the folder appear at the bottom of the window and the OK button should be available.
Once you have connected to the additional folders you want to use in conjunction with this project, you can navigate to them within the Folders link in the Catalog pane. While you could add data directly from the other folders, the better practice is to copy the data from one project to another. If, for example, you plan to alter the data, then using it directly from the previous folder would alter it there as well. This could cause issues when you return to that project. For this exercise you can navigate to the Exercise 2 folder, copy the tornado_data file, and paste it in the Exercise 5 folder. This is the safest way to move data such as shapefiles or geodatabases. Because these data types are made up of numerous individual files, the catalog will copy or move them all correctly. If you tried to move them using File Explorer and missed one of the files associated with that data, it might not work appropriately. So for Exercise 5, you will need to copy the tornado_data and montco_tracts data from Exercises 2 and 3, respectively.
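If you ever do need to copy a shapefile outside of the catalog, the point made above still applies: every file that shares the dataset's base name has to travel together. The following is a minimal Python sketch of that idea using only the standard library. The function name and folder names are hypothetical, and it assumes the base name itself contains no periods.

```python
import shutil
from pathlib import Path


def copy_shapefile(shp_path, dest_dir):
    """Copy a .shp file and every sibling file that shares its base name."""
    shp_path = Path(shp_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Everything before the first period, e.g. "tornado_data".
    base = shp_path.name.split(".")[0]

    copied = []
    for part in sorted(shp_path.parent.iterdir()):
        if part.is_file() and part.name.split(".")[0] == base:
            shutil.copy2(part, dest_dir / part.name)
            copied.append(part.name)
    return copied
```

A call such as copy_shapefile("Exercise2/tornado_data.shp", "Exercise5") returns the names of the files it copied, so you can confirm that none of the companion files were left behind.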
Finally, you will need to download the Brownfields and Demographics data from the Exercise 5, GitHub Data page. Save both in your Exercise 5 project folder and unzip the brownfields.zip file to access the dataset.
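Before unzipping, you can also inspect an archive to see which files it holds. The sketch below is one possible way to do this in Python with the standard library: it groups the files in a zip archive by base name and lists the extensions found for each. The function name is hypothetical, and the path you pass in depends on where you saved the download.

```python
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath


def summarize_zip(zip_source):
    """Group the files in a zip archive by base name, listing each extension."""
    groups = defaultdict(list)
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            filename = PurePosixPath(name).name
            # Split at the first period so "sites.shp.xml" groups under "sites".
            base, _, ext = filename.partition(".")
            groups[base].append("." + ext if ext else "")
    return {base: sorted(exts) for base, exts in groups.items()}
```

Calling summarize_zip on the path to a downloaded zip file returns a dictionary with one entry per base name, which makes it easy to compare against what you see in your file browser after extraction.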
Question No. 1: What is the common name of the extracted files? How many are there? What are the various file extensions? The Library of Congress has a great description of the various extensions here.
As with previous exercises you should begin by launching QGIS, creating a new empty project, and creating a project folder for this specific exercise. You should now see the typical starting screen that greeted you in all of the previous exercises. While some of the data for this exercise you may already have in previous exercise folders, you will start this lab by downloading a dataset from TN GIS. While they maintain a number of quality collections, you will specifically download the statewide watershed coverage (12 digit Hydrologic Unit Code) for Tennessee. This information can be found at the following link: http://www.tngis.org/water.htm. On that page you will find the link for “Download Watershed Coverage”. Click the link, and using the download button in the upper-right corner, save the tn_wbd zip file to your project folder.
Once you have downloaded the file, navigate to the saved location to unzip the file. Within the unzipped folder you will find three additional folders titled:
These are watershed files at varying levels of detail. For hydrologic units you are looking for one with the largest number of digits to get the largest scale data. So for this exercise you will unzip the tn_12dig_huc dataset.
Finally, with that final folder extracted you will find a folder titled hydrologic_units that will contain a shapefile named wbdhu12_a_tn.shp that will be used in this exercise. This is the polygon file representing the 12 digit hydrologic unit codes for the entire state of Tennessee.
Next, you will need the tornado_data file from Exercise 2 and the census_tracts data from Exercise 3. You have a few options for obtaining this data. You can download the data again (but this time to the new project folder), you can navigate to the Exercise 2 and Exercise 3 project folders, respectively, on your computer and copy the zip files to the Exercise 5 project folder, or you can copy the data over using the browser window in QGIS. While the first two options are relatively straightforward, it is important to be confident navigating and using the browser in QGIS.
If you created a “favorites” folder, you will most likely navigate within that location; if you haven’t created a favorite folder, you will need to search through your drives for the tornado_data file from Exercise 2. Once you locate the file, right-click (or Ctrl-click) the file and select Export Layer > To File…. In the resulting window, select ESRI Shapefile as the “Format”. For the “File name”, click the browse button, give the file a name, and save it to your Exercise 5 project folder. If you check the “Add Saved File to Map” box and click OK, the file will be added to your layers.
Repeat this process for the census_tracts data from Exercise 3. While you could add data directly from the other folders, the best practice might be to export the data from one project to another. If for example you plan to alter the data then using it directly from the previous folder would alter it there as well. This could cause future issues when returning to that project. With these two files added to your layers you only need to download the Brownfields and Demographics data from the Exercise 5, GitHub Data page. Save both in your Exercise 5 project folder and unzip the brownfields.zip file to access the dataset.
Question No. 1: What is the common name of the extracted files? How many are there? What are the various file extensions?
Before you begin, you will need to open the Ex2 Colab Notebook and insert tocolab after github in the URL to open in the Colab Environment. As you have seen before, R requires various packages to complete certain analyses. In this exercise you will be using tidyverse, OpenStreetMaps, ggfortify, maptools, and rgeos. To install and load the packages we will use the following script:
Blah…blah…blah…
Question No. 1: Insert Text Here
The data collected in the previous section requires additional processing so you can reduce the dataset to only the pertinent information for the analyses. In this step you will use additional geoprocessing techniques and data management tools to link two datasets for further examination.
With the data collected you can now add the brownfields, census tracts, tornado_data, and wbdhu12_a_tn (watersheds) data to your project. Although there are a number of ways of isolating data to make derived datasets (e.g. Select > Lasso in Exercise 4, Step 1), in this exercise you will use another tool from the Geoprocessing Toolbox to complete this task. On the View Tab click on the Geoprocessing Toolbox button to open the Geoprocessing pane on the right side of the screen. By navigating through the tools menus you will find Select under Analysis Tools > Extract. With this tool you will write a simple expression to “select” a small portion of the data you need for further analysis. To do this, double-click the Select tool and in the resulting pane input the following parameters:
*The tornado dataset is only being used to obtain a polygon for Montgomery County for the clip process in the next step.
This will add the new montgomery_county shapefile to your contents. You can now remove the tornado dataset because it will no longer be needed. With the polygon of Montgomery County available, you can now use the Clip tool as in Exercise 4, Step One to clip the brownfields and watersheds datasets, reducing them to only the features within Montgomery County. If you receive a “Datum conflict” warning, for the purposes of this exercise you can ignore it and continue with the clip. Recall that the Input Features is the data you want to reduce, the Clip Features is the data you want it to take the shape of, and the Output Feature Class is the name of the new file. Refer back to Exercise 4, Step One for more information about Clip.
With the new clipped datasets you can remove or just uncheck (in case you want to use them in your final map) the full brownfields and watersheds datasets to reduce clutter. You can also now zoom in closer to view only Montgomery County.
In the final step to prepare the data, you are going to connect a non-spatial dataset to the census tract dataset. In Step One you downloaded a file titled demographics.csv. This file contains comma-separated values detailing additional demographic data that you need to append to the census tract data. Although the process is relatively straightforward, there are a number of steps that need to be taken in order to join the data.
First, if you haven’t already, add the demographics.csv file to your table of contents. This can be done from the Catalog Pane or with the “Add Data” button like in previous exercises. Because ArcGIS Pro treats *.csv files as “read only” you need to convert it to a table that can be edited in the software. Now, right-click on the demographics.csv standalone table and go to Data > Export Table. In the resulting window choose the following options:
Before clicking OK, expand the Fields section of the window and click on Tract in the Output Fields column. Then click on the Properties tab and change the Type field to Text. Then click OK. If you continued without changing the field type, the variable would most likely be treated as a numerical value, and a numerical field will not line up with a text field in a join. If you open the attribute table for any dataset and hover over a variable column without clicking, a pop-up window will appear detailing the Type and other parameters of the variable. In the census dataset from the previous exercise, the NAME variable is Type: Text (7). The seven in parentheses means the maximum number of available characters is seven. So before you export a table, it is good practice to make sure the variables match the type of the variables you intend to join, or that they will be treated in the manner required for additional analyses.
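The effect of the field type is easy to demonstrate outside of ArcGIS Pro. The Python sketch below reads a small, made-up CSV twice: once keeping the Tract column as text and once converting it to a number. The tract values and the Under18 column are hypothetical and are not taken from demographics.csv.

```python
import csv
import io

# Hypothetical sample rows, for illustration only.
SAMPLE_CSV = """Tract,Under18
1001,420
1002.10,615
"""


def read_keys(text, as_text):
    """Read the Tract column either as text or as a floating-point number."""
    rows = csv.DictReader(io.StringIO(text))
    return [row["Tract"] if as_text else float(row["Tract"]) for row in rows]


# Tract names as they might be stored in a text field of the tract layer.
tract_names = ["1001", "1002.10"]

text_keys = read_keys(SAMPLE_CSV, as_text=True)
numeric_keys = read_keys(SAMPLE_CSV, as_text=False)

# Text keys match exactly; numbers lose their original formatting.
print([key in tract_names for key in text_keys])
print([str(key) in tract_names for key in numeric_keys])
```

Once the values have been read as numbers, trailing zeros and the original formatting are gone, so converting them back to text no longer reproduces the names stored in the tract layer.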
The new standalone table should have been added to the Table of Contents. If not you should add it now; the csv table can be removed. Now you can connect the new table to the census tract dataset. Begin by right-clicking on the census data and selecting Joins and Relates > Add Join. In the new Add Join window select the following options (your file names may vary):
For this exercise, keep the “Keep All Target Features” box checked. If you receive a warning about an indexing error with the census data, you can ignore it for this exercise. Then click the Validate Join button. This opens a new window that describes the process of checking the two datasets to see whether they can be joined. At the bottom of the dialog you should see a line that says there were 39 joins. Close that message and click OK to run the join.
Finally, open the attribute table for the census tracts and scroll to the far right of the table. If the join worked properly you should see a number of additional fields added to the table.
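Conceptually, the join looks up each census tract's key in the standalone table and appends the matching fields, while keeping every tract even when no match is found. The Python sketch below illustrates that behavior under simplified assumptions, with rows held as dictionaries and hypothetical field names. It is not how ArcGIS Pro implements Add Join internally.

```python
def left_join(target_rows, join_rows, target_key, join_key):
    """Append join_rows fields to target_rows, keeping all target rows.

    Returns the joined rows and the number of target rows that matched.
    """
    lookup = {row[join_key]: row for row in join_rows}
    join_fields = [f for f in join_rows[0] if f != join_key] if join_rows else []

    joined, matched = [], 0
    for row in target_rows:
        extra = lookup.get(row[target_key])
        if extra is not None:
            matched += 1
        merged = dict(row)
        for field in join_fields:
            # Unmatched target rows are kept, with empty joined fields.
            merged[field] = extra[field] if extra is not None else None
        joined.append(merged)
    return joined, matched


# Hypothetical rows, for illustration only.
tracts = [{"NAME": "1001"}, {"NAME": "1002"}, {"NAME": "1003"}]
demographics = [
    {"Tract": "1001", "Under18": "420"},
    {"Tract": "1003", "Under18": "310"},
]
rows, match_count = left_join(tracts, demographics, "NAME", "Tract")
print(match_count, rows)
```

The match count plays the same role as the number reported by Validate Join: if it is lower than you expect, the key fields probably differ in type or formatting.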
This will provide all of the data and information you need to visualize the data and make comparisons of the watersheds.
Question No. 2: How many watersheds cover Montgomery County? Although they have been clipped from their original geometry, which watershed is the largest? Which is the smallest?
With the data collected you can now add the brownfields, census tracts, tornado_data, and wbdhu12_a_tn (watersheds) data to your project. Although there are a number of ways of isolating data to make derived datasets (e.g. Select Features > Select Features by Freehand in Exercise 4, Step 1), in this exercise you will use another tool from Vector Selection in the Processing Toolbox to complete this task.
With this tool you will select only the small portion of the data you need for further analysis. To do this, double-click the “Extract by Attribute” tool and in the resulting window input the following parameters (file names may vary):
Remember that in QGIS you have the ability to either create permanent files or temporary layers. Because you will only be using the Montgomery County dataset to clip files later on, you can decide whether to use the browse button to save the file for future use or just create a temporary file.
This will add the new montgomery_county temporary file (or shapefile if saved) to your layers. You may want to rename the layer if necessary. You can now remove the tornado dataset because it will no longer be needed. With the polygon of Montgomery County available, you can now use the Clip tool as in Exercise 4, Step One to clip the brownfields and watersheds datasets, reducing them to only the features within Montgomery County. Recall that the Input layer is the data you want to reduce (e.g. brownfields or watersheds) and the Overlay layer is the data you want it to take the shape of. Refer back to Exercise 4, Step One for more information about Clip. You should use the browse button to save these as permanent files. Be sure to use a naming convention that will allow you to recall what the files are later on (e.g. montco_brownfields).
With the new clipped datasets you can remove or just uncheck (in case you want to use them in your final map) the full brownfields and watersheds datasets to reduce clutter. You can also now zoom in closer to view only Montgomery County.
In the final step to prepare the data, you are going to connect a non-spatial dataset to the census tract dataset. In Step One you downloaded a file titled demographics.csv. This file contains comma-separated values detailing additional demographic data that you need to append to the census tract data. Although the process is relatively straightforward, there are a number of steps that need to be taken in order to join the data.
First, if you haven’t already, add the demographics.csv file to your table of contents.
Question No. 2: Insert Text Here
Blah…blah…blah…
Question No. 2: Insert Text Here
In this step you will need to examine the spatial distribution of brownfields within the watersheds of Montgomery County and make some qualitative interpretations of potentially impacted urban areas.
Examine the spatial distribution of the brownfields throughout the county. The clustering should be relatively apparent and might match up with your knowledge of industrial activities in the various areas of Montgomery County. To help quantify the number of brownfields in each watershed, you can use a spatial join to create a count variable for this information. To do this, right-click on the Montgomery County watershed dataset and go to Joins and Relates > Spatial Join.
In the resulting window, select the following parameters (your file names may vary):
You can leave the remaining items blank and click OK.
By examining the attribute table for the new dataset you should see a new variable called Join_Count. This is the number of brownfields that occur within each watershed.
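To make the meaning of Join_Count concrete, the Python sketch below counts how many points fall inside each polygon using a basic ray-casting test. It is a simplified illustration under stated assumptions: the polygons are plain lists of (x, y) vertices with no holes, the coordinates are made up, and this is not the algorithm ArcGIS Pro documents for its Spatial Join tool.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the polygon ring."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def join_count(polygons, points):
    """Count the points inside each named polygon, like a Join_Count field."""
    return {
        name: sum(point_in_polygon(x, y, ring) for x, y in points)
        for name, ring in polygons.items()
    }


# Hypothetical watersheds and brownfield locations, for illustration only.
watersheds = {
    "A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "B": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
brownfields = [(1, 1), (2, 3), (5, 2), (9, 9)]
print(join_count(watersheds, brownfields))
```

Points that fall outside every polygon, like the last one in the sample list, contribute to no count, which mirrors brownfields that lie outside the clipped watershed boundaries.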
Using the skills you learned in Exercises Two, Three, and Four you can now make a map that shows Montgomery County, the location of brownfields and watersheds in a graduated color scheme by number of brownfields. Remember to include cartographic elements such as legend, scale bar, north arrow, etc. In this visualization you may also want to add a different basemap or inset map that provides additional supporting information.
Question No. 3: Which watershed contains the most brownfields?
Blah…blah…blah…
Question No. 3: Insert Text Here
Blah…blah…blah…
Question No. 3: Insert Text Here
After discussing the results of the previous analysis with your colleagues at County Commission, Stormwater Management, Health Department, and TDEC, they are interested in seeing how the location of brownfields impacts the community. Although the commission districts do not perfectly replicate the census tracts, the County Commissioners and the Health Department want to know if the brownfield sites are directly related to census tracts with large minority populations. They are concerned by a recently published report that states:
“While there is no single way to characterize communities located near our sites, this population is more minority, low income, linguistically isolated, and less likely to have a high school education than the U.S. population as a whole. As a result, these communities may have fewer resources with which to address concerns about their health and environment.”
During these discussions the Health Department would also like to know if the areas with a high number of brownfields have higher populations of children.
Using the skills you learned in this and previous exercises, create a new spatial join between the census tracts and brownfields datasets (for this exercise ignore any datum warning). As with the previous spatial join, you should now have an additional variable labeled “Join_Count” that details the number of brownfields per census tract.
One way you can view two variables at once on a map is to create bivariate symbology.
This creates a grid of colors with an X and Y axis with the following categories:
The other grid cells represent midpoints in the variables. The variables can be selected in the symbology pane where “Field 1” is one variable and “Field 2” is the other variable. So to visualize brownfields and one of the population demographics select one for each field. For “Grid Size” select 3x3. You can select your own color scheme with the drop-down menu. In the fields section below, you can rename the fields (e.g. “join count” means nothing so give it an appropriate name).
With these settings you should have a symbology that shows if a census tract is high or low in either of the particular variables. Test a number of different variable combinations with the count of brownfields and various demographic categories.
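The logic behind a 3x3 bivariate grid can be sketched in a few lines of Python: each variable is split into low, mid, and high classes, and the pair of classes picks the grid cell. The sketch below assumes rank-based (quantile-style) breaks and uses made-up values. The break rules applied by your GIS software may differ, so treat this as an illustration of the idea rather than a reproduction of the symbology.

```python
LABELS = ("Low", "Mid", "High")


def tercile_class(value, values):
    """Return 0 (low), 1 (mid), or 2 (high) based on rank among all values."""
    ordered = sorted(values)
    n = len(ordered)
    low_break = ordered[(n - 1) // 3]
    high_break = ordered[(2 * (n - 1)) // 3]
    if value <= low_break:
        return 0
    if value <= high_break:
        return 1
    return 2


def bivariate_cell(v1, v2, all_v1, all_v2):
    """Name the 3x3 grid cell for one feature, e.g. 'High-Low'."""
    first = LABELS[tercile_class(v1, all_v1)]
    second = LABELS[tercile_class(v2, all_v2)]
    return f"{first}-{second}"


# Hypothetical values per census tract, for illustration only.
brownfield_counts = [0, 0, 1, 2, 5, 9]
percent_under_18 = [10, 20, 30, 40, 50, 60]

for count, pct in zip(brownfield_counts, percent_under_18):
    print(count, pct, bivariate_cell(count, pct, brownfield_counts, percent_under_18))
```

A tract labeled High-High in this scheme is high in both variables, which is the combination the County Commission and Health Department are most interested in.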
Question No. 4: Which census tract contains the most brownfields?
Blah…blah…blah…
Question No. 4: Insert Text Here
Blah…blah…blah…
Question No. 4: Insert Text Here
In the report you provide to the County Commission, Stormwater Management, Health Department, and TDEC please provide the following information:
When complete, send a link to your Colab Notebook or a Word document with answers to Questions 1-4, along with your completed map, via email.